ABSTRACT

A broad but parametrically simple model for a stationary sequence
of dependent discrete random variables is given and several submodels are
discussed. The structure of the model is specified by the marginal
distribution of the random variables and several other parameters. The
sequence of random variables is formed by a probabilistic linear
combination of independent, identically distributed discrete random variables
and is in general not Markovian. Second-order joint moments and spectra
are obtained for the model, as well as some properties for the lengths
of runs. The special case in which the random variables take
only two values is useful as a model for the counting process in a
discrete-time point process. An application to the modelling of errors
in the transmission of binary data is briefly discussed.

1. INTRODUCTION



In this paper we will introduce a simple method for obtaining
a stationary sequence of dependent random variables having a specified
marginal distribution and correlation structure (second-order joint
moments). One advantage of the model is that these two aspects can be
specified independently of each other. Another advantage is that the
sequence is obtained as a very simple transformation of a sequence of
independent random variables. The model is analogous to models for
dependent sequences of exponential random variables introduced in Jacobs
and Lewis (1977) and Lawrance and Lewis (1977).

We will now define some quantities which will be used throughout
the paper. Let $\{Y_n\}$ be a sequence of independent random variables taking
values in a discrete space E, each having the distribution $\pi$. Let
$\{U_n\}$ and $\{V_n\}$ be independent sequences of independent random variables
taking the values 0 and 1 with

(1.1)    $P\{U_n = 1\} = \beta$ and $P\{V_n = 1\} = \rho$

for fixed $0 \le \beta \le 1$ and $0 \le \rho < 1$. Let $\{S_n\}$ be a sequence of
independent identically distributed random variables taking the values
0,1,2,...,N with distribution F, where N is a fixed integer.

The most general case which we will consider here is a sequence
$\{X_n\}$ of random variables which is formed according to the probabilistic
linear model

(1.2)    $X_n = U_n Y_{n-S_n} + (1 - U_n) A_{n-(N+1)}$    for n = 1,2,...,

where

(1.3)    $A_n = V_n A_{n-1} + (1 - V_n) Y_n$    for n = -N, -N+1, ... .

The model of (1.2) and (1.3) will be termed DARMA(1,N+1) (discrete
mixed autoregressive-moving average process with autoregression of
order 1 and moving average of order N+1).

If we start the process with $A_{-N-1}$ having the distribution
$\pi$ independent of $\{Y_n; n \ge -N\}$, $\{U_n\}$, $\{V_n\}$ and $\{S_n\}$, then the $X_n$'s,
n = 1,2,..., will be shown to form a stationary sequence of dependent
discrete random variables having marginal distribution $\pi$. This
stationary sequence is in general not Markovian, although it will be
so if $\beta = 0$. Its correlation structure is determined by the parameters
$\beta$ and $\rho$ and the distribution F. Note that $\pi$ can be any distribution.
Some cases of discrete distributions of particular interest are obtained
by choosing $\pi$ to be geometric or Poisson.

Certain special cases of the DARMA(1,N+1) process are of
particular interest and their consideration will make the nomenclature
clear.

(i) The DAR(1) process.

If $\beta = 0$, then

(1.4)    $X_n = A_{n-(N+1)} = \begin{cases} A_{n-(N+1)-1} & \text{with probability } \rho, \\ Y_{n-(N+1)} & \text{with probability } (1-\rho). \end{cases}$

$\{A_n\}$ is called the DAR(1) process (discrete autoregressive process of
order 1).



(ii) The DMA(N) process.

If $\beta = 1$, then $X_n = Y_{n-S_n}$; that is, $X_n = Y_{n-k}$ with probability F(k) for
k = 0,1,...,N where F is the distribution of $S_n$. In this case,
$\{X_n\}$ is called a DMA(N) process (discrete moving average process of
order N). Note that if $\{X_n\}$ is a DMA(1) process, then

$X_n = \begin{cases} Y_n & \text{with probability } F(0), \\ Y_{n-1} & \text{with probability } 1 - F(0). \end{cases}$



(iii) The DARMA(1,1) process.

Finally, if N = 0, then $\{X_n\}$ will be termed a DARMA(1,1)
process (discrete mixed autoregressive-moving average process both of
order 1) with parameters $\beta$ and $\rho$; that is,

$X_n = \begin{cases} Y_n & \text{with probability } \beta, \\ A_{n-1} & \text{with probability } (1-\beta). \end{cases}$

Note that the DMA(1) process is a special case of the DARMA(1,1)
process when $\rho = 0$.



(iv) Independent process.

If $\beta = 1$, N = 0 or $\beta = 0$, $\rho = 0$, then $\{X_n\}$ is a sequence
of independent random variables with common distribution $\pi$.
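The construction in (1.2) and (1.3) can be sketched as a short simulation. The Python code below is an illustration only, not part of the original development: the function name simulate_darma, its arguments and the use of the standard random module are assumptions made here. It reads the model as "X_n picks Y_{n-S_n} when U_n = 1 (probability beta) and A_{n-(N+1)} otherwise, while A_n repeats A_{n-1} when V_n = 1 (probability rho) and takes the value Y_n otherwise".

```python
import random


def simulate_darma(pi, beta, rho, F, n, seed=0):
    """Simulate X_1..X_n from the backward DARMA(1, N+1) model (sketch).

    pi   : dict mapping each state to its marginal probability
    beta : assumed P{U_n = 1} (X_n is taken from the moving-average part)
    rho  : assumed P{V_n = 1} (A_n repeats A_{n-1})
    F    : list with F[k] = P{S_n = k}, k = 0..N
    """
    rng = random.Random(seed)
    states, weights = list(pi), list(pi.values())
    N = len(F) - 1

    def draw_y():
        return rng.choices(states, weights)[0]

    # Y_m for m = -N..n, and A_m for m = -N-1..n, keyed by the index m.
    Y = {m: draw_y() for m in range(-N, n + 1)}
    A = {-N - 1: draw_y()}                       # starting value A_{-N-1} ~ pi
    for m in range(-N, n + 1):
        A[m] = A[m - 1] if rng.random() < rho else Y[m]

    X = []
    for m in range(1, n + 1):
        if rng.random() < beta:                  # U_m = 1: random-index term
            s = rng.choices(range(N + 1), F)[0]  # S_m ~ F
            X.append(Y[m - s])
        else:                                    # U_m = 0: autoregressive tail
            X.append(A[m - (N + 1)])
    return X
```

Setting beta = 0 gives a DAR(1) path, beta = 1 a DMA(N) path, and len(F) == 1 a DARMA(1,1) path, which mirrors the special cases (i) to (iii).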

The model of (1.2) and (1.3) is really the backward DARMA model.
The forward model is defined in a similar fashion. However, the two,
while similar, are not necessarily equivalent. This is because $\{X_n\}$
is not in general time reversible in the sense that $\{X_1, X_2, ..., X_n\}$ will
not in general have the same distribution as $\{X_n, X_{n-1}, ..., X_1\}$.
The properties of one model can be derived by the same techniques as
those of the other, so we will only consider the backward model.

Note that the DARMA process may be defined using any sequence
$\{Y_n\}$ of independent identically distributed random variables, not
necessarily discrete. However, (1.2) and (1.3) show that $\{X_n\}$ is
obtained as a mixture of the $\{Y_n\}$ sequence. As a result, even if
the distribution of $Y_n$ is continuous, a realization of the sequence
$\{X_n\}$ will in general contain many runs of a single value. This seems
to be the major drawback to using this scheme to obtain a sequence of
dependent random variables with a specified continuous marginal
distribution and correlation structure. Other schemes for obtaining sequences
of dependent exponential and gamma random variables have been proposed
which look more promising; cf. Lawrance and Lewis (1977), Gaver and
Lewis (1977), and Jacobs and Lewis (1977).

One motivation behind the DARMA models was to provide a simple
scheme for obtaining models with which to analyse stationary sequences
of dependent discrete random variables with specified marginal
distribution and correlation structure. In general, there is not much beyond
a Markov chain model, which is overparametrized for statistical purposes,
for modelling dependent sequences of random variables. In addition it
is very often simple to show from data that the correlation structure
of the sequence is not Markovian. The DARMA model can be used to model
non-Markovian sequences of discrete random variables; an observed
sequence of this kind is discussed in the last section.



Another motivation for the development of this process was to
provide models for point processes in which the data are given in terms
of counts in fixed time intervals rather than the exact times of arrivals.
Most models for point processes beyond the Poisson process are most
easily described in terms of times of arrivals or times between arrivals
and it is often hard to obtain results concerning the joint distribution
of counts in different fixed time intervals. We feel that the DARMA
models will be of use in such situations. There is also the possibility
of modelling directly the binary counting process in a discrete time
point process. This is discussed in Section 6.

2. SOME PRELIMINARY PROPERTIES OF THE DARMA(1,N+1) PROCESS

In this section we will give some properties for the DARMA(1,N+1)
process. Unless otherwise indicated we will assume throughout the paper
that $A_{-N-1}$ has distribution $\pi$ and is independent of $\{Y_n\}$, $\{U_n\}$,
$\{V_n\}$, and $\{S_n\}$.

2.1. The marginal distribution of $X_n$.

We will first show that $X_n$ as defined by (1.2) has distribution
$\pi$ for all n. To this end we note from the expression (1.3) that the
random variable $A_n$ can be expanded backwards to the initial value
$A_{-N-1}$ to give $A_n = Y_{n-j}$ with probability $\rho^j(1-\rho)$ for $0 \le j \le N+n$
and $A_n = A_{-N-1}$ with probability $\rho^{N+n+1}$; that is, $A_n$ is a mixture
of $Y_n, Y_{n-1}, ..., Y_{-N}$ and $A_{-(N+1)}$. Thus for i in the state
space E

$P\{A_n = i\} = \sum_{j=0}^{N+n} \rho^j (1-\rho) \pi(i) + \rho^{N+n+1} \pi(i) = \pi(i)$

for n = -N, -N+1, ... . Similarly,

$P\{Y_{n-S_n} = i\} = \sum_{j=0}^{N} P\{Y_{n-j} = i\} P\{S_n = j\} = \pi(i) \sum_{j=0}^{N} P\{S_n = j\} = \pi(i)$

for $i \in E$ and n = 1,2,... . From (1.2) and (1.3) it now follows that

$P\{X_n = i\} = \beta P\{Y_{n-S_n} = i\} + (1-\beta) P\{A_{n-(N+1)} = i\} = \pi(i)$

for $i \in E$ and n = 1,2,... . Hence, the marginal distribution of
the $X_n$'s, like that of the $Y_n$'s, is $\pi$.

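2.2. Correlational properties of $\{X_n\}$

Although the $X_n$'s have a stationary distribution $\pi$, the $X_n$'s
are not independent, as are the $Y_n$'s. This can be seen by the following
calculation of the covariance between $X_n$ and $X_{n+j}$. After some
simplification

(2.1)    $E[X_n X_{n+j}] - E[X_n] E[X_{n+j}]$
         $= \beta^2 \{E[Y_{n-S_n} Y_{n+j-S_{n+j}}] - E[Y_{n-S_n}] E[Y_{n+j-S_{n+j}}]\}$
         $+ \beta(1-\beta) \{E[Y_{n-S_n} A_{n+j-(N+1)}] - E[Y_{n-S_n}] E[A_{n+j-(N+1)}]\}$
         $+ (1-\beta)\beta \{E[A_{n-(N+1)} Y_{n+j-S_{n+j}}] - E[A_{n-(N+1)}] E[Y_{n+j-S_{n+j}}]\}$
         $+ (1-\beta)^2 \{E[A_{n+j-(N+1)} A_{n-(N+1)}] - E[A_{n+j-(N+1)}] E[A_{n-(N+1)}]\}$.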

The covariance of $Y_{n-S_n}$ and $Y_{n+1-S_{n+1}}$ is

$E[Y_{n-S_n} Y_{n+1-S_{n+1}}] - E[Y_{n-S_n}] E[Y_{n+1-S_{n+1}}] = \sum_{k=0}^{N-1} P\{S_{n+1} = k+1\} P\{S_n = k\} \mathrm{Var}(Y_n)$,

since the two random indices select the same $Y$ exactly when $S_{n+1} = S_n + 1$.
In the case $\beta = 1$ (the DMA(N) model), only the first term in (2.1)
is nonzero and we can get the correlation from the above result.
Putting $F(k) = P\{S_n = k\}$ we have for the correlation

(2.2)    $\rho_Y(1) = \mathrm{corr}(Y_{n-S_n}, Y_{n+1-S_{n+1}}) = \sum_{k=0}^{N-1} F(k) F(k+1)$.

By similar reasoning, for $j \le N$

(2.3)    $\rho_Y(j) = \mathrm{corr}(Y_{n-S_n}, Y_{n+j-S_{n+j}}) = \sum_{k=0}^{N-j} F(k) F(k+j)$

and for j > N, $\rho_Y(j) = 0$. Note that these expressions do not depend on
n and thus the DMA(N) process is second order covariance stationary.

We will now compute the covariance of $A_n$ and $A_{n-1}$, which
appears in (2.1); this will incidentally give us the correlation structure
of the DAR(1) process.

$E[A_n A_{n-1}] - E[A_n] E[A_{n-1}] = \rho \{E[A_{n-1}^2] - E[A_{n-1}]^2\} + (1-\rho) \{E[Y_n A_{n-1}] - E[Y_n] E[A_{n-1}]\}$
$= \rho \, \mathrm{Var}(A_{n-1})$,

since the second term is zero because $Y_n$ and $A_{n-1}$ are independent.
This is because $A_{n-1}$ is a function only of $Y_{n-1}, Y_{n-2}, ...$ . By an
induction argument we obtain for the correlation of $A_n$ and $A_{n+j}$

(2.4)    $\rho_A(j) = \mathrm{corr}(A_n, A_{n+j}) = \rho^j$

for $j \ge 1$. Because of the assumption that $A_{-N-1}$ has distribution $\pi$,
(2.4) does not depend on n and thus the autoregressive process is
second-order covariance stationary.

To complete the result for the general DARMA(1,N+1) process,
we compute the cross covariances between the sequences $\{Y_{n-S_n}\}$ and
$\{A_n\}$. We obtain

(2.5)    $E[Y_{n+j-S_{n+j}} A_{n-(N+1)}] - E[Y_{n+j-S_{n+j}}] E[A_{n-(N+1)}] = 0$

for $j \ge 1$ since $Y_{n+j-S_{n+j}}$ is one of $Y_{n+j-N}, ..., Y_{n+j}$, all of which are
independent of $A_{n-(N+1)}$. For $0 \le j \le N$

(2.6)    $E[Y_{n-S_n} A_{n-N+j}] - E[Y_{n-S_n}] E[A_{n-N+j}] = \sum_{k=N-j}^{N} P\{S_n = k\} \rho^{k-N+j} (1-\rho) \mathrm{Var}(Y_n)$
         $= [(1-\rho) F(N-j) + \rho(1-\rho) F(N-j+1) + \cdots + \rho^j (1-\rho) F(N)] \mathrm{Var}(Y_n)$.

For j > N

(2.7)    $E[Y_{n-S_n} A_{n-N+j}] - E[Y_{n-S_n}] E[A_{n-N+j}] = \sum_{k=0}^{N} P\{S_n = k\} \{E[Y_{n-k} A_{n-N+j}] - E[Y_{n-k}] E[A_{n-N+j}]\}$
         $= [\rho^{j-N}(1-\rho) F(0) + \rho^{j-N+1}(1-\rho) F(1) + \cdots + \rho^j (1-\rho) F(N)] \mathrm{Var}(Y_n)$.

Putting everything together in (2.1) we obtain the correlation
of $X_n$ and $X_{n+j}$. We have

(2.8)    $\rho_X(1) = (E[X_n X_{n+1}] - E[X_n] E[X_{n+1}]) / \mathrm{Var}(X_n)$
         $= \beta^2 \sum_{k=0}^{N-1} F(k) F(k+1) + \beta(1-\beta)(1-\rho) F(N) + (1-\beta)^2 \rho$.

For $1 < j \le N$

(2.9)    $\rho_X(j) = \beta^2 \sum_{k=0}^{N-j} F(k) F(k+j) + \beta(1-\beta)(1-\rho) \{F(N-j+1) + \rho F(N-j+2) + \cdots + \rho^{j-1} F(N)\} + (1-\beta)^2 \rho^j$.

For $j \ge N+1$

(2.10)   $\rho_X(j) = \rho^{j-(N+1)} \{\beta(1-\beta)(1-\rho)[F(0) + \rho F(1) + \cdots + \rho^N F(N)] + (1-\beta)^2 \rho^{N+1}\}$.

Note that $0 \le \rho_X(j) \le 1$ and for j > N, $\rho_X(j)$ decreases
geometrically if $\rho > 0$ and $\beta < 1$. Since $\rho_X(j)$ is independent of n,
the DARMA(1,N+1) process is second order covariance stationary.
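As a check on how the three contributions in (2.8) to (2.10) combine, the lag-j correlation can be evaluated numerically. The following Python sketch is illustrative; the function name darma_corr and its argument layout are assumptions, and the cross term is written as a single sum that reproduces the bracketed geometric sums for both $j \le N$ and $j > N$.

```python
def darma_corr(j, beta, rho, F):
    """Lag-j correlation of a DARMA(1, N+1) process (sketch of (2.8)-(2.10)).

    beta = P{U_n = 1}, rho = P{V_n = 1}, F[k] = P{S_n = k} for k = 0..N.
    """
    N = len(F) - 1
    if j == 0:
        return 1.0
    # Moving-average part: sum_k F(k) F(k+j); it vanishes once j > N.
    ma = sum(F[k] * F[k + j] for k in range(N - j + 1)) if j <= N else 0.0
    # Cross term: X_n picks Y_{n-k} (probability F(k)) and A_{n+j-(N+1)}
    # reaches the same Y after k - (N+1-j) repeats, each of probability rho.
    cross = sum(F[k] * rho ** (k - (N + 1 - j))
                for k in range(max(0, N + 1 - j), N + 1))
    return (beta ** 2 * ma
            + beta * (1 - beta) * (1 - rho) * cross
            + (1 - beta) ** 2 * rho ** j)
```

With beta = 0 the function reduces to rho**j, the DAR(1) correlation (2.4), and with beta = 1 it reduces to the DMA(N) correlation (2.3).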



2.3. Invariance under transformations

From its definition we note that $X_n$ is a mixture of the random
variables $Y_n, Y_{n-1}, ..., Y_{-N}$ and $A_{-(N+1)}$, i.e., it is a random
selection of one and only one of these random variables. Thus, if we
transform each of the random variables $Y_n, Y_{n-1}, ..., Y_{-N}, A_{-(N+1)}$ by
the same function, each $X_n$ will be transformed in the same way and
its distribution will be that of the transformed $Y_n$'s.
Similar remarks apply if we transform the $X_n$. Note that in
applying a common transformation individually to the $X_n$ we do not
affect the selection procedure and therefore the correlation structure
of the transformed process is the same as that of the untransformed
process. This (marginal) transformation invariance is important for
statistical analysis of the process.

3. THE AUTOREGRESSIVE PROCESS DAR(1)

In this section we will give some properties of the DAR(1)
process $\{A_n\}$. As usual we will assume that $A_{-N-1}$ has distribution $\pi$.
By the results of Section 2, $\{A_n\}$ is a stationary sequence of random
variables with marginal distribution $\pi$ and correlations

(3.1)    $\rho_A(j) = \mathrm{corr}(A_n, A_{n+j}) = \rho^j$,    $j \ge 1$.

The spectrum of the process is thus

(3.2)    $f(\omega) = \frac{1}{2\pi} \{1 + 2 \sum_{j=1}^{\infty} \rho_A(j) \cos(\omega j)\} = \frac{1}{2\pi} \, \frac{1 - \rho^2}{1 + \rho^2 - 2\rho \cos \omega}$.

It follows from (1.3) that $\{A_n\}$ is a Markov chain; that is,
$P\{A_{n+1} = i | A_1, ..., A_n\} = P\{A_{n+1} = i | A_n\}$ for i in the state space E.
Further, it is not hard to show that the transition matrix P is
given by

(3.3)    $P\{A_{n+1} = i | A_n = k\} = P(k,i) = \begin{cases} (1-\rho)\pi(i) & \text{for } k \ne i, \\ \rho + (1-\rho)\pi(i) & \text{for } k = i. \end{cases}$

Note that we have started from the opposite direction from that
usually taken in Markov chain theory; we have specified the stationary
distribution associated with the chain first and specified the (Markovian)
dependency structure by a single parameter $\rho$. Moreover changing $\rho$
does not affect $\pi$. When $\rho = 0$ we have a stationary sequence of
independent identically distributed random variables with distribution $\pi$.

The fact that $\{A_n\}$ is a Markov chain with a particularly simple
transition function P makes many calculations quite easy. For example,
in discrete time series, runs of given values of the random variables
are useful in statistical analyses. Properties of these runs are
easy to obtain for the DAR(1) process. Thus fix a state $i \in E$ and
let $T_i = \inf\{n \ge 1 : A_n \ne i\} - 1$; $T_i$ is the length of a run of i's
starting at time 1, where length can be 0,1,... . Then

$P\{T_i \ge n\} = P\{A_1 = A_2 = \cdots = A_n = i\} = \pi(i) P(i,i)^{n-1}$

for $n \ge 1$ and $P\{T_i = 0\} = 1 - \pi(i)$. Thus

(3.4)    $E[T_i] = \frac{\pi(i)}{1 - P(i,i)} = \frac{\pi(i)}{(1-\rho)[1 - \pi(i)]}$.



If $\rho = 0$, then $A_n = Y_n$ for $n \ge 1$ and $E[T_i] = \pi(i)/[1-\pi(i)]$
as expected since $\{A_n\}$ is a sequence of independent random variables
in this case. Note that for $0 \le \rho < 1$

$E[T_i] = \frac{1}{1-\rho} \, \frac{\pi(i)}{1 - \pi(i)} \ge \frac{\pi(i)}{1 - \pi(i)}$;

that is, the expected length of a run of i's for a DAR(1) process is
always greater than or equal to the expected run length for a sequence
of independent random variables. Moreover the inflation in the expected
length of runs is uniform for all states. This is a consequence of
the fact that we are dealing with a one parameter Markov chain.
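The transition matrix (3.3) and the expected run length (3.4) are easy to evaluate numerically. The Python sketch below is illustrative only; the function names are assumptions, the marginal distribution is passed as a list indexed by state, and the series version simply truncates the tail sum of $P\{T_i \ge n\}$ used in the derivation.

```python
def dar1_transition(pi, rho):
    """Transition matrix of the DAR(1) chain: (1-rho)*pi(i), plus rho on the diagonal."""
    m = len(pi)
    return [[(1 - rho) * pi[i] + (rho if k == i else 0.0) for i in range(m)]
            for k in range(m)]


def dar1_expected_run(pi, rho, i):
    """Closed form for E[T_i]: pi(i) / ((1 - rho) * (1 - pi(i)))."""
    return pi[i] / ((1 - rho) * (1 - pi[i]))


def dar1_expected_run_series(pi, rho, i, terms=2000):
    """E[T_i] as the truncated sum of P{T_i >= n} = pi(i) * P(i,i)**(n-1)."""
    p_ii = dar1_transition(pi, rho)[i][i]
    return sum(pi[i] * p_ii ** (n - 1) for n in range(1, terms + 1))
```

For any choice of rho the rows of the matrix sum to one and pi is left invariant, which is the sense in which changing rho does not affect the marginal distribution.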

It is also not hard to calculate the generating function for
$T_i$. We have for $0 \le z \le 1$

$\Phi(z) = \sum_{n=0}^{\infty} z^n P\{T_i = n\}$
$= [1 - \pi(i)] + \sum_{n=1}^{\infty} z^n \pi(i) [P(i,i)]^{n-1} [1 - P(i,i)]$
$= [1 - \pi(i)] + \frac{z [1 - P(i,i)] \pi(i)}{1 - z P(i,i)}$
$= \frac{[1 - \pi(i)][1 - z\rho]}{1 - z\pi(i) - z\rho[1 - \pi(i)]}$.

Again, if $\rho = 0$, $\Phi(z)$ reduces to the expression for the
generating function of the length of a run of i for a sequence of
independent random variables with marginal distribution $\pi$.

4. THE DMA(N) PROCESS

In this section we will consider the DMA(N) process $X_n = Y_{n-S_n}$.
Note that, unlike the DAR(1) process, the DMA(N) process is not Markovian
in general.

4.1. Correlation properties.

By results in Section 2, $\{X_n\}$ is a stationary sequence of random
variables with marginal distribution $\pi$ and correlations

(4.1)    $\rho_{MA}(j) = \sum_{k=0}^{N-j} F(k) F(k+j) = \sum_{v=j}^{N} F(v) F(v-j)$

for $1 \le j \le N$. Also $\rho_{MA}(j) = 0$ for j > N and $\rho_{MA}(0) = 1$. Note
that when N = 1, the maximum value of the first order serial correlation is
$\max_{F(0)} \rho_{MA}(1) = \max_{F(0)} \{F(0)[1 - F(0)]\} = 1/4$. In fact one can show that
for any $N \ge 1$ the maximum first order serial correlation
that can be achieved is 1/4. One can also maximize the correlation
at any point, say j, by making F(j) = F(0) = 1/2. However, all the
other correlations are then zero.

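For the spectrum of the DMA(N) process we have

(4.2)    $f(\omega) = \frac{1}{2\pi} \{1 + 2 \sum_{j=1}^{N} \rho_{MA}(j) \cos(\omega j)\}$;

then if we define $\rho_{MA}(-j) = \rho_{MA}(j)$, we have

(4.3)    $f(\omega) = \frac{1}{2\pi} \left[ \sum_{j=-\infty}^{\infty} e^{i\omega j} \sum_{v=|j|}^{N} F(v) F(v-|j|) + 1 - \sum_{j=0}^{N} F(j)^2 \right]$
         $= \frac{1}{2\pi} \left[ \left( \sum_{j=0}^{N} e^{i\omega j} F(j) \right) \left( \sum_{j=0}^{N} e^{-i\omega j} F(j) \right) + 1 - \sum_{j=0}^{N} F(j)^2 \right]$
         $= \frac{1}{2\pi} \left[ \varphi_S(\omega) \varphi_S(-\omega) + 1 - \sum_{j=0}^{N} F(j)^2 \right]$
         $= \frac{1}{2\pi} \left[ |\varphi_S(\omega)|^2 + 1 - \sum_{j=0}^{N} F(j)^2 \right]$,

where $\varphi_S$ is the characteristic function of the distribution F of
the random variable S. Thus we can model a broad class of spectra
$f(\omega)$. If F(0) = 1 we have an independent identically distributed
sequence and a flat (constant) spectrum.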
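The agreement between the cosine form (4.2) and the characteristic-function form (4.3) can be checked numerically. The Python sketch below is illustrative; the function names are assumptions, and F is passed as a list with F[k] = P{S = k}.

```python
import cmath
import math


def dma_corr(j, F):
    """Lag-j correlation of a DMA(N) process: sum_k F(k) F(k+|j|), zero beyond lag N."""
    N = len(F) - 1
    j = abs(j)
    if j == 0:
        return 1.0
    return sum(F[k] * F[k + j] for k in range(N - j + 1)) if j <= N else 0.0


def dma_spectrum_cos(w, F):
    """Spectrum from the correlations, as a finite cosine series."""
    N = len(F) - 1
    series = sum(dma_corr(j, F) * math.cos(w * j) for j in range(1, N + 1))
    return (1 + 2 * series) / (2 * math.pi)


def dma_spectrum_cf(w, F):
    """Spectrum from the characteristic function phi_S of the index distribution."""
    phi = sum(F[j] * cmath.exp(1j * w * j) for j in range(len(F)))
    return (abs(phi) ** 2 + 1 - sum(f * f for f in F)) / (2 * math.pi)
```

Both functions return 1/(2*pi) at every frequency when F = [1.0], the flat spectrum of an independent sequence.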

By way of example, it is worth noting that we have restricted
S to have finite support. Then (4.3) is a polynomial in $\cos \omega$ just
like any moving average process. The finite support was necessary
to allow inclusion of the autoregressive tail (1.2). If one does not
want to add this tail, then there is no reason to restrict the range
of S. One then gets a much broader class of models for which (4.3)
in particular holds, although the model is still a random index model.
This extended model is not as broad as the DARMA(1,N+1) model in
the sense that one cannot, as in linear models (see for example,
Anderson, l97o) represent the tail (1.3) as a random index model in
which the random indices for each n are independent random variables.

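To continue with the example, in the extended moving average
model let S have a geometric distribution

$P\{S = j\} = F(j) = p(1-p)^j$,    j = 0,1,... .

Then

$\varphi_S(\omega) = \frac{p}{1 - (1-p) e^{i\omega}}$

and

$f(\omega) = \frac{1}{2\pi} \left\{ \frac{p^2}{[1 - (1-p) e^{i\omega}][1 - (1-p) e^{-i\omega}]} + 1 - \sum_{j=0}^{\infty} p^2 (1-p)^{2j} \right\}$
$= \frac{1}{2\pi} \left\{ \frac{p^2}{1 + (1-p)^2 - 2(1-p) \cos \omega} + \frac{2p(1-p)}{1 - (1-p)^2} \right\}$.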

The initial point in the spectrum is related to the amount of
long term dependence there is in the process. One could measure this
by an index of dispersion (Cox and Lewis, 1966, p. 71)

$J_k = \frac{\mathrm{Var}(X_1 + \cdots + X_k)}{k \{E[X]\}^2}$

and

$\lim_{k \to \infty} J_k = 2\pi \frac{\mathrm{Var}(X)}{\{E[X]\}^2} f(0+)$.

For the moving average process, $f(0+) = \frac{1}{2\pi} [2 - \sum_{j=0}^{N} F(j)^2]$,
which takes values between $1/2\pi$ and $1/\pi$. To compare the moving
average process to the DAR(1) we note that from (3.2) for the DAR(1) process
$f(0+) = \frac{1}{2\pi} [(1-\rho^2)/(1-\rho)^2] = \frac{1}{2\pi} \, \frac{1+\rho}{1-\rho}$, which is always greater than $1/2\pi$
if $\rho > 0$ and increases with $\rho$ to infinity. Note that $f(0+)$
for a sequence of independent random variables is $1/2\pi$. Thus both
the moving average process and the DAR process give more long term
dependence than a sequence of independent identically distributed
random variables. The DAR(1) process allows more long term dependence
than the moving average process.



4.2. Joint distributions and time reversibility

Unless otherwise indicated we will restrict our attention to
the DMA(1) process in the remainder of this section; that is, if
$\alpha = P\{S_n = 0\}$, then

(4.8)    $X_n = \begin{cases} Y_n & \text{with probability } \alpha, \\ Y_{n-1} & \text{with probability } (1-\alpha). \end{cases}$

In this case $\rho_{MA}(1) = \alpha(1-\alpha)$ and $\rho_{MA}(j) = 0$ for $j \ge 2$.

It is not hard to calculate the joint Laplace-Stieltjes
transforms of the joint distributions of random variables in the
DMA(1) sequence but it is tedious. For example, if $r(s) = E[e^{-sX_n}]$,
then from (4.8)

$\psi_2(s_1, s_2) = E[\exp\{-s_1 X_1 - s_2 X_2\}]$
$= (1-\alpha) r(s_1) r(s_2) + \alpha(1-\alpha) r(s_1 + s_2) + \alpha^2 r(s_1) r(s_2)$
$= r(s_1) r(s_2) [1 - \alpha(1-\alpha)] + \alpha(1-\alpha) r(s_1 + s_2)$.

Similarly, by conditioning arguments we obtain

$\psi_3(s_1, s_2, s_3) = E[\exp\{-s_1 X_1 - s_2 X_2 - s_3 X_3\}]$
$= (1-\alpha) r(s_1) \psi_2(s_2, s_3) + \alpha(1-\alpha) r(s_1 + s_2) r(s_3)$
$+ \alpha^2 (1-\alpha) r(s_1) r(s_2 + s_3) + \alpha^3 r(s_1) r(s_2) r(s_3)$
$= r(s_1) r(s_2) r(s_3) [1 - 2\alpha(1-\alpha)]$
$+ [r(s_1) r(s_2 + s_3) + r(s_1 + s_2) r(s_3)] \alpha(1-\alpha)$

and

$\psi_4(s_1, s_2, s_3, s_4) = E[\exp\{-s_1 X_1 - s_2 X_2 - s_3 X_3 - s_4 X_4\}]$
$= (1-\alpha) r(s_1) \psi_3(s_2, s_3, s_4) + \alpha(1-\alpha) r(s_1 + s_2) \psi_2(s_3, s_4)$
$+ \alpha^2 (1-\alpha) r(s_1) r(s_2 + s_3) r(s_4) + \alpha^3 (1-\alpha) r(s_1) r(s_2) r(s_3 + s_4)$
$+ \alpha^4 r(s_1) r(s_2) r(s_3) r(s_4)$
$= r(s_1) r(s_2) r(s_3) r(s_4) [1 - 3\alpha(1-\alpha) + \alpha^2 (1-\alpha) - \alpha^3 (1-\alpha)]$
$+ r(s_1) r(s_2) r(s_3 + s_4) [\alpha(1-\alpha) - \alpha^2 (1-\alpha) + \alpha^3 (1-\alpha)]$
$+ r(s_1) r(s_2 + s_3) r(s_4) \alpha(1-\alpha)$
$+ r(s_1 + s_2) r(s_3) r(s_4) [\alpha(1-\alpha) - \alpha^2 (1-\alpha) + \alpha^3 (1-\alpha)]$
$+ r(s_1 + s_2) r(s_3 + s_4) \alpha^2 (1-\alpha)^2$.

One reason for interest in the joint distributions of the random variables
is to look at the time reversibility of the process. One reason for
concern with time reversibility is the following. The EMA(1) process
(exponential moving average of order 1) of Lawrance and Lewis (1977)
is not time reversible even though this fact cannot be determined
from second-order properties of the process. Consequently one
cannot distinguish between $\alpha$ and $(1-\alpha)$ in the spectrum of the
EMA(1) process. However, by using higher order moments it is possible
to distinguish between $\alpha$ and $(1-\alpha)$. For the DMA(1) process the
fact that $\rho_{MA}(1) = \alpha(1-\alpha)$ means we cannot use it to distinguish
between $\alpha$ and $(1-\alpha)$. The time reversibility for the DMA(1) process
would mean that we might not be able to distinguish between $\alpha$ and
$(1-\alpha)$ even by using higher order moments.

Since $\psi_n(s_1, s_2, ..., s_n) = \psi_n(s_n, s_{n-1}, ..., s_1)$ for n = 2,3,4,
it seems likely that the DMA(1) process is time reversible. In order
to show time reversibility we need to show that $\psi_n(s_1, s_2, ..., s_n) =
\psi_n(s_n, s_{n-1}, ..., s_1)$ for any nonnegative $s_1, ..., s_n$. For simplicity
consider the terms $a = r(s_1) r(s_2 + s_3) r(s_4) \cdots r(s_n)$ and
$b = r(s_1) r(s_2) \cdots r(s_{n-3}) r(s_{n-2} + s_{n-1}) r(s_n)$ of $\psi_n(s_1, s_2, ..., s_n)$.
In the expression for $\psi_n(s_1, ..., s_n)$ the term a has a coefficient
of

$\sum_{j=3}^{n} P\{S_1 = 0 \text{ or } 1, S_2 = 0, S_3 = 1, S_4 = 1, ..., S_j = 1, S_{j+1} = 0, ..., S_n = 0\}$
$= \sum_{j=3}^{n} P\{S_1 = 0, S_2 = 0, S_3 = 1, S_4 = 1, ..., S_j = 1, S_{j+1} = 0, ..., S_n = 0\}$
$+ \sum_{j=3}^{n} P\{S_1 = 1, S_2 = 0, S_3 = 1, S_4 = 1, ..., S_j = 1, S_{j+1} = 0, ..., S_n = 0\}$

since the event associated with the term a is that $X_2$ and $X_3$ pick
the same $Y_j$ and that all the other $X_i$'s pick distinct $Y_j$'s from
each other. Similarly, the term b has a coefficient of

$\sum_{j=1}^{n-2} P\{S_n = 0 \text{ or } 1, S_{n-1} = 1, S_{n-2} = 0, S_{n-3} = 0, ..., S_j = 0, S_{j-1} = 1, ..., S_1 = 1\}$.

Since the $S_n$ are independent and only take the values 0 and 1
and the $Y_n$ are independent, the coefficients of the terms a and b
of $\psi_n(s_1, ..., s_n)$ are equal.

Similar arguments can be used to show that the coefficients
of the terms

$\prod_{i=1}^{j_1-1} r(s_i) \; r(s_{j_1} + s_{j_1+1}) \prod_{i=j_1+2}^{j_2-1} r(s_i) \; r(s_{j_2} + s_{j_2+1}) \cdots r(s_{j_k} + s_{j_k+1}) \prod_{i=j_k+2}^{n} r(s_i)$

and

$\prod_{i=1}^{j_1-1} r(s_{n+1-i}) \; r(s_{n-j_1+1} + s_{n-j_1}) \prod_{i=j_1+2}^{j_2-1} r(s_{n+1-i}) \; r(s_{n-j_2+1} + s_{n-j_2}) \cdots r(s_{n-j_k+1} + s_{n-j_k}) \prod_{i=j_k+2}^{n} r(s_{n+1-i})$

of $\psi_n(s_1, ..., s_n)$ are equal for any sequences of integers
$1 \le j_1 < j_1 + 1 < j_2 < j_2 + 1 < \cdots < j_k < n$
and $k \ge 1$. Thus $\psi_n(s_1, ..., s_n) = \psi_n(s_n, ..., s_1)$ for all n.
Hence the DMA(1) process is time reversible.
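For a small binary example the reversibility claim can be confirmed by exact enumeration. The Python sketch below is only an illustration under stated assumptions: the Y's are taken to be binary with P{Y = 1} = pi1, alpha = P{S_n = 0} as in (4.8), and the function names are invented here. It builds the exact joint law of $(X_1, ..., X_n)$ with rational arithmetic and compares each outcome with its reversal.

```python
from fractions import Fraction
from itertools import product


def dma1_joint(n, alpha, pi1):
    """Exact joint law of (X_1..X_n) for a binary DMA(1) process, by enumeration."""
    alpha, pi1 = Fraction(alpha), Fraction(pi1)
    law = {}
    for ys in product((0, 1), repeat=n + 1):           # values of Y_0, ..., Y_n
        p_y = Fraction(1)
        for y in ys:
            p_y *= pi1 if y == 1 else 1 - pi1
        for ss in product((0, 1), repeat=n):           # values of S_1, ..., S_n
            p_s = Fraction(1)
            for s in ss:
                p_s *= alpha if s == 0 else 1 - alpha  # alpha = P{S_n = 0}
            # X_m = Y_{m - S_m}; ys[k] holds Y_k and ss[m-1] holds S_m.
            xs = tuple(ys[m - ss[m - 1]] for m in range(1, n + 1))
            law[xs] = law.get(xs, 0) + p_y * p_s
    return law


def is_time_reversible(law):
    """True if every outcome has the same probability as its reversal."""
    return all(p == law.get(xs[::-1], 0) for xs, p in law.items())
```

For example, is_time_reversible(dma1_joint(4, Fraction(3, 10), Fraction(1, 4))) checks the case n = 4; the enumeration grows as 2**(2n+1), so it is only practical for short stretches.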
4.3. Run lengths for the DMA(1) process.

We will now consider length of runs for a DMA(1) process.
Fix a state i in E and let $T_i = \inf\{n \ge 1 : X_n \ne i\} - 1$ be the length
of a run of i initiated at time 1 where length can be 0,1,... .
We will first compute $E[T_i]$. Let $a_0 = 1$ and $a_n = P\{X_1 = X_2 = \cdots = X_n = i\}$
for $n \ge 1$. Then

$a_1 = P\{X_1 = i\} = \pi(i)$,

$a_2 = P\{X_1 = X_2 = i\} = (1-\alpha) \pi(i) a_1 + \alpha(1-\alpha) \pi(i) a_0 + \alpha^2 \pi(i)^2$

and by induction

$a_{n+1} = P\{X_1 = X_2 = \cdots = X_{n+1} = i\}$
$= (1-\alpha) \pi(i) a_n + \alpha(1-\alpha) \pi(i) a_{n-1} + \alpha^2 (1-\alpha) \pi(i)^2 a_{n-2} + \cdots$
$+ \alpha^n (1-\alpha) \pi(i)^n a_0 + \alpha^{n+1} \pi(i)^{n+1}$.



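Thus

$E[T_i] = \sum_{n=1}^{\infty} a_n$
$= \pi(i) + (1-\alpha) \pi(i) \sum_{n=1}^{\infty} a_n + (1-\alpha) \sum_{k=1}^{\infty} \alpha^k \pi(i)^k \sum_{n=0}^{\infty} a_n + \sum_{n=1}^{\infty} \alpha^{n+1} \pi(i)^{n+1}$
$= \pi(i) + \frac{\alpha^2 \pi(i)^2 + \alpha(1-\alpha) \pi(i)}{1 - \alpha \pi(i)} + \left[ (1-\alpha) \pi(i) + \frac{\alpha(1-\alpha) \pi(i)}{1 - \alpha \pi(i)} \right] E[T_i]$.

Solving the last equation for $E[T_i]$ we obtain

(4.9)    $E[T_i] = \frac{p(i)}{1 - p(i)}$,

where

(4.10)   $p(i) = \pi(i)[1 + \alpha(1-\alpha)] - \pi(i)^2 \alpha(1-\alpha)$
         $= \pi(i) + \alpha(1-\alpha) \pi(i) [1 - \pi(i)]$.

If $\alpha$ is either 0 or 1, then $\{X_n\}$ is a sequence of independent
random variables and $E[T_i] = \pi(i)/(1 - \pi(i))$ as expected. Note that
$E[T_i] > \pi(i)/[1 - \pi(i)]$ for $0 < \alpha < 1$; that is, the expected length
of a run of i for a DMA(1) process is greater than the expected
length of a run for a sequence of independent random variables.
For a given distribution $\pi$, the maximum value for $E[T_i]$ occurs when
$\alpha = 1/2$. In this case

$E[T_i] = \frac{\pi(i)[5 - \pi(i)]}{[1 - \pi(i)][4 - \pi(i)]}$.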
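We now turn our attention to the generating function of $T_i$.
Fix $j \ne i$ in the state space E and let

$b_0 = P\{X_1 = j\} = \pi(j)$

and

$b_n = P\{X_1 = \cdots = X_n = i, X_{n+1} = j\}$,    $n \ge 1$.

Using an induction argument we obtain for $n \ge 1$

$b_n = (1-\alpha) \pi(i) b_{n-1} + \alpha \pi(i) (1-\alpha) b_{n-2} + \cdots + \alpha^{n-1} \pi(i)^{n-1} (1-\alpha) b_0 + \alpha^n \pi(i)^n \alpha \pi(j)$.

Thus

$\sum_{n=0}^{\infty} z^n b_n = \pi(j) + \sum_{n=0}^{\infty} z^n b_n \left\{ z(1-\alpha) \pi(i) + (1-\alpha) z \sum_{n=1}^{\infty} \alpha^n \pi(i)^n z^n \right\} + \alpha \pi(j) \sum_{n=1}^{\infty} \alpha^n \pi(i)^n z^n$.

Summing over $j \ne i$, so that $\pi(j)$ is replaced by $1 - \pi(i)$, we obtain
after some simplification

(4.11)   $\Psi(z) = \sum_{n=0}^{\infty} z^n P\{T_i = n\} = \frac{[1 - \pi(i)][1 - z\alpha(1-\alpha)\pi(i)]}{1 - z\pi(i) - z^2 \pi(i)[1 - \pi(i)] \alpha(1-\alpha)}$.

Note that for $\alpha$ = 0 or 1, $\Psi(z) = [1 - \pi(i)]/[1 - z\pi(i)]$ as expected.
Higher order moments of the run lengths can be obtained from (4.11).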
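The closed form (4.9) can be compared with a direct summation of the run probabilities $a_n = P\{X_1 = \cdots = X_n = i\}$. The Python sketch below is illustrative; the function names are assumptions, and the recursion conditions on the position of the first index $S_k$ equal to 1, which is one reading of the induction step for the $a_n$.

```python
def dma1_expected_run(pi_i, alpha):
    """E[T_i] = p(i) / (1 - p(i)) with p(i) = pi(i) + alpha(1-alpha) pi(i) (1 - pi(i))."""
    p = pi_i + alpha * (1 - alpha) * pi_i * (1 - pi_i)
    return p / (1 - p)


def dma1_run_probs(pi_i, alpha, n_max):
    """Return [a_0, ..., a_{n_max}] with a_n = P{X_1 = ... = X_n = i}."""
    a = [1.0, pi_i]
    for n in range(1, n_max):
        nxt = (1 - alpha) * pi_i * a[n]                        # S_1 = 1
        nxt += sum(alpha ** k * (1 - alpha) * pi_i ** k * a[n - k]
                   for k in range(1, n + 1))                   # S_1..S_k = 0, S_{k+1} = 1
        nxt += (alpha * pi_i) ** (n + 1)                       # S_1..S_{n+1} all 0
        a.append(nxt)
    return a
```

Since $E[T_i] = \sum_{n \ge 1} a_n$, the value sum(dma1_run_probs(pi_i, alpha, 200)[1:]) should agree with dma1_expected_run(pi_i, alpha) up to truncation error, which is negligible when the a_n decay geometrically.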
5. THE BINARY DARMA(1,1) PROCESS

In this section we will consider a DARMA(1,1) process in
which $X_n$ takes only the values 0 and 1; that is,

(5.1)    $X_n = \begin{cases} Y_n & \text{with probability } \beta, \\ A_{n-1} & \text{with probability } (1-\beta) \end{cases}$

and

         $A_n = \begin{cases} A_{n-1} & \text{with probability } \rho, \\ Y_n & \text{with probability } (1-\rho), \end{cases}$

where $\{Y_n\}$ is a sequence of independent random variables taking the
values 0 and 1 with common distribution $\pi$. Note that the DARMA(1,1)
process is not Markovian in general.

Time series of binary random variables are of particular
importance for modelling the differential counting process in
discrete time point processes. Klotz (1973) and Kanter (1975)
have given a model which is different from the binary DARMA(1,1)
process.



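Setting N = 0 in (2.8)-(2.10) we obtain the correlation
of $X_n$ and $X_{n+j}$, $j \ge 1$,

$\rho_X(j) = \mathrm{corr}(X_n, X_{n+j}) = \rho^{j-1} \{\beta(1-\beta)(1-\rho) + (1-\beta)^2 \rho\}$.

The spectrum of the process is thus

(5.2)    $f(\omega) = \frac{1}{2\pi} \{1 + 2 \sum_{j=1}^{\infty} \rho_X(j) \cos(\omega j)\} = \frac{1}{2\pi} \left\{ 1 + \frac{2 c(\beta,\rho) (\cos \omega - \rho)}{1 + \rho^2 - 2\rho \cos \omega} \right\}$,

where

$c(\beta,\rho) = \beta(1-\beta)(1-\rho) + (1-\beta)^2 \rho$.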



We will now consider some properties of lengths of runs.
For fixed $i \in \{0,1\}$, let $T_i = \inf\{n \ge 1 : X_n \ne i\} - 1$ as before.
We will calculate $E[T_i]$ and the generating function for $T_i$. To
begin, note that although $\{X_n\}$ is not a Markov chain, $\{(A_n, X_n)\}$, n = 1,2,...,
is a Markov chain. For i, l, j, k in {0,1}

$P\{A_{n+1} = j, X_{n+1} = k | A_n = i, X_n = l\} = Q_k(i,j)$

independent of l. Letting $Q_k$ denote the matrix whose (i,j) entry
is $Q_k(i,j)$ we have

$Q_0 = \begin{pmatrix} \rho(1-\beta) + [1 - \rho(1-\beta)] \pi(0) & (1-\beta)(1-\rho) \pi(1) \\ \beta(1-\rho) \pi(0) & \beta\rho \, \pi(0) \end{pmatrix}$

and

$Q_1 = \begin{pmatrix} \beta\rho \, \pi(1) & \beta(1-\rho) \pi(1) \\ (1-\beta)(1-\rho) \pi(0) & (1-\beta)\rho + [1 - (1-\beta)\rho] \pi(1) \end{pmatrix}$.
Note that $P\{T_0 \ge n | A_0 = i\} = Q_0^n(i,0) + Q_0^n(i,1) = Q_0^n(i,E)$
for i = 0,1. Hence

$E[T_0 | A_0 = i] = \sum_{n=1}^{\infty} Q_0^n(i,E) = R_0(i,E) - 1$

where $R_0(i,j) = \sum_{n=0}^{\infty} Q_0^n(i,j)$ with $Q_0^0 = I$ being the identity matrix
and $R_0(i,E) = R_0(i,0) + R_0(i,1)$. It is not hard to show that

$R_0 = (I - Q_0)^{-1} = \frac{1}{\Delta} \begin{pmatrix} 1 - \beta\rho \, \pi(0) & (1-\beta)(1-\rho) \pi(1) \\ \beta(1-\rho) \pi(0) & 1 - \rho(1-\beta) - [1 - \rho(1-\beta)] \pi(0) \end{pmatrix}$

with

$\Delta = \det(I - Q_0) = [1 - \pi(0)] \{1 - \rho(1-\beta)[1 - \beta\pi(0)] - \beta\pi(0)[1 - \beta(1-\rho)]\}$.

Thus

(5.3)    $E[T_0] = \pi(0) R_0(0,E) + \pi(1) R_0(1,E) - 1$
         $= \frac{\pi(0) \{1 + \mathrm{corr}(X_n, X_{n+1}) + \beta\rho - \rho\} + \pi(0)^2 \{-\mathrm{corr}(X_n, X_{n+1}) - 2\beta\rho + \rho\}}{1 - \rho(1-\beta) + \pi(0) \{-1 + 2\rho - 3\beta\rho - \mathrm{corr}(X_n, X_{n+1})\} + \pi(0)^2 \{\mathrm{corr}(X_n, X_{n+1}) + 2\beta\rho - \rho\}}$
         $= \frac{\pi(0) \{\mathrm{corr}(X_n, X_{n+1}) + 1 - \rho + \beta\rho - \pi(0) (\mathrm{corr}(X_n, X_{n+1}) + 2\beta\rho - \rho)\}}{[1 - \pi(0)] \{1 - \rho + \beta\rho - \pi(0) (\mathrm{corr}(X_n, X_{n+1}) + 2\beta\rho - \rho)\}}$

after some simplification.

Similarly one can show that

(5.4)    $E[T_1] = \pi(0) R_1(0,E) + \pi(1) R_1(1,E) - 1$, where $R_1 = (I - Q_1)^{-1}$,
         $= \frac{\pi(1) \{1 + \mathrm{corr}(X_n, X_{n+1}) + \beta\rho - \rho\} + \pi(1)^2 \{-\mathrm{corr}(X_n, X_{n+1}) - 2\beta\rho + \rho\}}{1 - \rho(1-\beta) + \pi(1) \{-1 + 2\rho - 3\beta\rho - \mathrm{corr}(X_n, X_{n+1})\} + \pi(1)^2 \{\mathrm{corr}(X_n, X_{n+1}) + 2\beta\rho - \rho\}}$
         $= \frac{\pi(1) \{\mathrm{corr}(X_n, X_{n+1}) + 1 - \rho + \beta\rho - \pi(1) (\mathrm{corr}(X_n, X_{n+1}) + 2\beta\rho - \rho)\}}{[1 - \pi(1)] \{1 - \rho + \beta\rho - \pi(1) (\mathrm{corr}(X_n, X_{n+1}) + 2\beta\rho - \rho)\}}$.

Note that the expected length of a run of i for the binary DARMA(1,1)
process is always greater than or equal to the expected length of a
run of i for the independent case.
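The matrix route to the expected run length and the closed forms (5.3) and (5.4) can be compared numerically. The Python sketch below is an illustration under stated assumptions: the function names are invented here, beta is read as the probability that $X_n$ takes the value $Y_n$, rho as the probability that $A_n$ repeats $A_{n-1}$, and the starting value of A is taken to have distribution pi.

```python
def q_matrix(k, beta, rho, pi0):
    """Q_k(i,j) = P{A_{n+1} = j, X_{n+1} = k | A_n = i} for the binary DARMA(1,1)."""
    pi = (pi0, 1 - pi0)
    Q = [[0.0, 0.0], [0.0, 0.0]]
    for i in (0, 1):
        for j in (0, 1):
            Q[i][j] = (beta * rho * pi[k] * (j == i)                 # X = Y, A repeats
                       + beta * (1 - rho) * pi[k] * (j == k)         # X = Y, A = Y
                       + (1 - beta) * rho * (k == i) * (j == i)      # X = A_n, A repeats
                       + (1 - beta) * (1 - rho) * (k == i) * pi[j])  # X = A_n, A = Y
    return Q


def expected_run_matrix(k, beta, rho, pi0):
    """E[T_k] = sum_i pi(i) R_k(i,E) - 1 with R_k = (I - Q_k)^{-1} (2x2 inverse)."""
    pi = (pi0, 1 - pi0)
    Q = q_matrix(k, beta, rho, pi0)
    a, b, c, d = 1 - Q[0][0], -Q[0][1], -Q[1][0], 1 - Q[1][1]
    det = a * d - b * c
    R = [[d / det, -b / det], [-c / det, a / det]]
    return sum(pi[i] * (R[i][0] + R[i][1]) for i in (0, 1)) - 1


def expected_run_closed(k, beta, rho, pi0):
    """Closed form in terms of the lag-one correlation."""
    pk = pi0 if k == 0 else 1 - pi0
    c1 = beta * (1 - beta) * (1 - rho) + (1 - beta) ** 2 * rho   # corr(X_n, X_{n+1})
    g = c1 + 2 * beta * rho - rho
    return (pk * (c1 + 1 - rho + beta * rho - pk * g)
            / ((1 - pk) * (1 - rho + beta * rho - pk * g)))
```

With beta = 1 both functions return pi(k)/(1 - pi(k)), the independent case, and with beta = 0 they return the DAR(1) value pi(k)/((1 - rho)(1 - pi(k))).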

We now turn our attention to the computation of
$\Phi_i(z) = \sum_{n=0}^{\infty} z^n P\{T_i = n\}$ for i = 0,1 and $0 \le z \le 1$. To begin, note that

$P\{T_0 = n | A_0 = i\} = \sum_{j} Q_0^n(i,j) [1 - Q_0(j,E)]$.

Thus

$\sum_{n=0}^{\infty} z^n P\{T_0 = n | A_0 = i\} = \sum_{j} \sum_{n=0}^{\infty} z^n Q_0^n(i,j) [1 - Q_0(j,E)]$.

It is not hard to show that $\sum_{n=0}^{\infty} z^n Q_0^n = (I - zQ_0)^{-1}$ where

$(I - zQ_0)^{-1} = \frac{1}{a(z)} \begin{pmatrix} 1 - z\beta\rho \, \pi(0) & z(1-\beta)(1-\rho) \pi(1) \\ z\beta(1-\rho) \pi(0) & 1 - z\{\rho(1-\beta) + [1 - \rho(1-\beta)] \pi(0)\} \end{pmatrix}$

with

$a(z) = 1 - z\rho(1-\beta) + z\pi(0) \{-\beta\rho - 1 + \rho(1-\beta)\}$
$+ z^2 \pi(0) [1 - \pi(0)] \{-\beta(1-\beta) + 2\rho\beta(1-\beta)\} + z^2 \pi(0)^2 \beta\rho$.

After some manipulation we obtain

$\Phi_0(z) = \frac{1 - z\rho(1-\beta) + \pi(0)[1 - \pi(0)] \{z\beta^2 - z\beta^2\rho - z\beta - z\beta^2\rho\} + \pi(0)[z\rho - 1] - z\beta\rho \, \pi(0)^2}{1 - z\rho(1-\beta) + z\pi(0) \{-\beta\rho - 1 + \rho(1-\beta)\} + z^2 \pi(0)[1 - \pi(0)] \{-\beta(1-\beta) + 2\rho\beta(1-\beta)\} + z^2 \pi(0)^2 \beta\rho}$.

In a similar manner one can show that

$\Phi_1(z) = \frac{1 - z\rho(1-\beta) + \pi(1)[1 - \pi(1)] \{z\beta^2 - z\beta^2\rho - z\beta - z\beta^2\rho\} + \pi(1)[z\rho - 1] - z\beta\rho \, \pi(1)^2}{1 - z\rho(1-\beta) + z\pi(1) \{-\beta\rho - 1 + \rho(1-\beta)\} + z^2 \pi(1)[1 - \pi(1)] \{-\beta(1-\beta) + 2\rho\beta(1-\beta)\} + z^2 \pi(1)^2 \beta\rho}$.

Higher moments for the runs can be obtained from the generating
functions. These are important in determining to what degree the binary
DARMA(1,1) process differs from Markov models, e.g. the DAR(1) model,
the differentiation considered here being to what extent the distribution
of run lengths departs from a geometric distribution.

6. TELEPHONE ERROR DATA



We discuss here very briefly a case in which the binary DARMA(1,1)
model may be of use; in particular, we do this to illustrate some of the
formulas given in the previous section. Another recent example of the
need for discrete time-series models is Gaver, Lavenberg and Price (1976).

Cox and Lewis (1966, p. 175) discussed data consisting of errors
which occurred during the transmission of binary data over a telephone
line. Let $X_n = 1$ indicate that the nth transmitted bit is in error,
while $X_n = 0$ indicates that no error occurred. Models which postulate
that bit errors occur independently or according to a Markov chain such
as the binary DAR(1) process predict that the runs of ones and runs of
zeros will both have geometric distributions. But the runs of zeros are
just the intervals (or number of bits) between errors without the times
consisting of 1 bit between errors, and these were shown by Berger and
Mandelbrot (1965) and Lewis and Cox (1966) to be nongeometric. In fact
they are highly skewed, long-tailed distributions which led Berger and
Mandelbrot (1965) to postulate a model in which intervals between errors
were assumed to be independent with Pareto-type distributions.

The problem with the Berger-Mandelbrot model was that the
intervals between errors were found to be dependent (Lewis and Cox, 1966).
Moreover there is a disproportionate number of 1-bit between-error
intervals, 128 out of 673 intervals, while the longest interval between
errors is 85,995 bits. This suggests that modelling the binary $X_n$
process may be a better approach than modelling the intervals, although
the modelling process must be non-Markovian.

The binary DARMA(1,1) process is a candidate process for
modelling this process; in particular one would like to know whether
the runs of zeros for $\beta$ between zero (Markovian) and one (independent)
produce highly-skewed run-length distributions. The question is too broad
to be considered here, involving also estimation of suitable values
of $\beta$ and $\rho$, and will be considered elsewhere. Here we will examine
only the effect of $\beta$ on $E(T_0)$ and the plausibility of the model.

For the error data, 672 bits out of 1,106,148 transmitted were
in error, so that we can estimate $\pi(1)$ as

$\hat{\pi}(1) = 1 - \hat{\pi}(0) = 672/1,106,148 = 0.0006075$.

Thus from (3.4) with $\rho = 0$ we compute that the expected lengths of
runs of zeros and ones, given that they occur, are respectively
$1/\pi(1) = 1,646.09$ and $1/\pi(0) = 1.000608$; the observed values from the
data are 1,911.27 and 1.235, both much longer than predicted under the
independence assumption ($\rho = 0$).
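The arithmetic behind these estimates is short enough to reproduce directly. The Python sketch below is illustrative; the function names are assumptions, and it rounds the estimated error rate to seven decimal places before inverting it, on the assumption that the quoted run lengths were computed from the rounded estimate.

```python
def estimate_error_rate(errors, bits):
    """Relative frequency of erroneous bits, used as the estimate of pi(1)."""
    return errors / bits


def mean_run_lengths_independent(pi1):
    """Mean lengths of runs of zeros and of ones, given that a run occurs,
    when the bits are independent: 1/pi(1) and 1/pi(0)."""
    return 1.0 / pi1, 1.0 / (1.0 - pi1)


pi1_hat = estimate_error_rate(672, 1_106_148)
zeros_run, ones_run = mean_run_lengths_independent(round(pi1_hat, 7))
print(f"{pi1_hat:.7f} {zeros_run:.2f} {ones_run:.6f}")
```

Conditioning on a run occurring turns the run length into a geometric variable on 1,2,..., which is why the conditional means are the reciprocals of the probability of the other symbol.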

In Table 1 we give values of $E(T_0)$ computed from the formula
(5.3).

Note that for small $\rho$, e.g. $\rho = 0.1$, the value of $E(T_0)$
first increases with increasing $\beta$, and then decreases. This is
characteristic behavior for the process when it is almost a moving average. For
large $\rho$, we find $E(T_0)$ decreasing with $\beta$. In particular $\beta$ has a
large effect on $E(T_0)$; it remains to be seen how $\beta$ affects the whole
distribution of runs.

The estimated first and second serial correlation coefficients
for the data, $\hat{\rho}(1)$ and $\hat{\rho}(2)$, are 0.190 and 0.121 respectively. This
is consistent with the model's restriction to positive serial correlation
coefficients. From the expressions for $\rho_X(j)$ given in
Section 5 we see that $\hat{\rho}(2)/\hat{\rho}(1) = 0.64$ is a rough estimate for $\rho$;
thus $\rho$ is relatively large for these data. With the proviso that $\rho$
is relatively large it might be possible to find unique values of $\beta$
and $\rho$ which, with the estimated $\pi(i)$, would make (5.3) and (5.4)
equal to the estimated $E(T_0)$ and $E(T_1)$. (This is not possible in
general since the expressions (5.3) and (5.4) are not single-valued
functions of $\beta$ for small fixed $\rho$.) An alternative is to use the
estimates of $\rho$ and $E(T_0)$, with $\hat{\pi}(0)$ and $\hat{\rho}(1)$, in (5.3) and solve
for $\beta$. The rough estimate obtained this way is $\beta = 0.61$.

It appears to be possible to estimate $\beta$ and $\rho$ in a more
systematic way using higher order joint moments of the $X_n$. This
will be discussed elsewhere.